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Abstract. We show that the quadratic matrix equation 
VW + r](W)W = I, for given V with positive real part and given 
analytic mapping r\ with some positivity preserving properties, has 
exactly one solution W with positive real part. Also we provide 
and compare numerical algorithms based on the iteration underly- 
ing our proofs. 

This work bears on operator-valued free probability theory, in 
particular on the determination of the asymptotic eigenvalue dis- 
tribution of band or block random matrices. 



1. Introduction 

This paper does two things. One concerns free probability and ran- 
dom matrix theory. The other concerns iterative methods for solving 
matrix equations of the form 

(1.1) VW + r](W)W = I, 

where V is a given matrix with positive real part, rj is a 'positivity 
preserving' linear or analytic mapping, and we are looking for a solution 
W with positive real part. 

The second topic bears directly on the first. Our main motivation for 
considering this kind of equation comes from free probability theory. 
We will now briefly review the relevance of this equation in the free 
probabilistic context of operator-valued semicircular elements. This 
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will only serve as a motivation and is not essential for our main state- 
ments about solving Eq. (II .ip in section 2. For a general introduction 
to free probability theory, see, e.g., [6]. 

Operator- valued semicircular elements [T2l[Tl] play an important role 
in free probability and random matrix theory In particular, a big class 
of random matrices (band matrices, block matrices) can asymptotically 
be described by such elements. This fundamental observation was made 
by Shlyakhtenko in [10J; extensions and the relevance of this from the 
point of view of electrical engineering problems were treated in [5]. 

The main problem in a random matrix context is the determination 
of the asypmptotic eigenvalue distribution. Let us denote by H(z) the 
Cauchy transform of the asymptotic eigenvalue distribution /x, given 
by 

H{z) := / —Mt). 

This is an analytic function in the upper complex half plane 

C+ := {z G C I Imz > 0} 
and allows to recover /i via the Stieltjes inversion formula 

dfiit) = limlm Hit + is). 

IT e^O 

The theory of operator-valued semicircular elements tells us that one 
can get this Cauchy transform as 

H(z) = <p{G(z)), 

where G(z) is an operator-valued function, i.e. G(z) G A for some 
operator algebra A (with involution *, containing the unit J), and 
where ip : A — > C is a given state on this algebra. (The data A and ip 
are determined by the form of the considered matrices; e.g., for block 
matrices as in [9], A = M</(C) are just the d x rf-matrices, for some 
fixed d, and ip = tr^ is the normalized trace on those matrices.) The 
operator-valued Cauchy transform G(z) is determined by the equation 

(1.2) zG(z)=I + rj{G(z))G(z) (z G C + ), 

together with its asymptotic behaviour G(z) ~ -I for z — > oo. Here 
we are given the completely positive linear map i] : A ^ A; it contains 
the information about the covariance of the considered random matrix 
or operator- valued semicircular element. 

One of the main problems, both from a conceptual and a numerical 
point of view, is that, for a fixed z G C + , the equation (11.21) has not 
just one, but many solutions. To isolate the correct root is not obvious 
at all. 



OPERATOR- VALUED SEMICIRCULAR ELEMENTS 



3 



We shall use the common notation: the real and imaginary parts of 
an operator W are defined as 

Re W := ~{W + W*) ImW := —(W-W*). 

Since G is also the operator- valued Cauchy transform of a semicircular 
element s, 

(for some C7*-algebra B with s G B, s = s*, and a conditional expecta- 
tion E : B — ► A), we have for Im z > that 

ImG(z) = i(G(z)-G(z)*) 



2i z — s z — s 

= -Imz'M _ 1 1 

z — s z — s 

Im 2! 

< o • 

~ (M + MI) 2 

since 

1 1 1 

> 9 

z-sz-s (| z | + || s ||) 2 

and a conditional expectation is positive. 

This means that we are looking, for any z 6 C + , for a solution G(z) 
of fll.2p which has the property that its imaginary part is negative and 
stays away from zero (which is the same as saying that its imaginary 
part is negative and invertible). Here we will show that actually there 
exists exactly one solution of (II. 2p with this property, and that one can 
get it by an appropriately chosen iteration procedure. 

If we write G(z) = — iW(z), then ImG(z) < —el gets replaced 
by the nicer condition KeW(z) > el. Thus we will work with W(z) 
instead of G(z) in the following. The equation (11.21) reads in terms of 
W as 

(1.3) -izW(z) +r](W(z))W(z) = I 

As it turns out the linearity of rj is not crucial (as long as it is analytic, 
and has certain positivity and boundedness properties). Thus we will 
treat the problem directly in this more general setting. Furthermore, 
one can also replace the complex number z G C + by an arbitrary 
element Z G A with positive imaginary part. Again we replace this by 



4 W. HELTON, R. RASHIDI FAR, AND R. SPEICHER 

the condition of positive real part by going over to V = —iZ, ending 
up with the Eq. (11.11) . 

2. Main Theorem 

2.1. Setting of our problem. We work in a C*-algebra, denoted A. 
Interesting examples arise already in the finite-dimensional case, thus 
the reader might just take A as a matrix algebra of complex d x d- 
matrices, A = M d (C) for some d G N. 

For a self adjoint operator A we mean with A > that its spectrum 
is contained in [0, oo). Let A + denote the strict right half plane of A, 

A + := {W G A | Re W > el for some e > }. 

Note that W G A + if and only if Re W > and Re W is invertible. 
Furthermore, we are given an analytic mapping 

r] : A+ -> A+, 

which is bounded on bounded domains in A + , i.e. 

sup{||77(WO|| | W G A+, \\W\\ <r}<oo 

for any r > 0. 

For a given V G A + we consider our key equation: 

(2.1) VW + rj(W)W = I. 

We are looking for a solution W G A+. 

Note that the solution of this equation is the same as a fixed point 
of the map W i— > J-'viW), where 

(2.2) F V (W):=[V + r 1 {W)}-\ 
Here is our main result. 

Theorem 2.1. For fixed V G A+, there exists exactly one solution 
W G A+ to (12. ID ; i/izs zs i/ie /z'mz't o/ iterates 

W n = ^(W ) 

for any W G ^4 + . Furthermore, we have that 

\\W\\ < IKReV)- 1 ]] 

and 

ReW > ' 



m 2 ■ || (Re T/)- 1 1| ' 
w/iere 

m ■= \\V\\ + Bup{||^(lV)|| \ WeA+, \\W\\ < \\(ReVy 
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Remark 2.2. The uniqueness part of our theorem can be used to give 
an easy direct proof of Prop. 5.6 in [3]. There two operator- valued 
Cauchy transforms G(X) and G n (X) are considered (where A might 
also be a matrix, with positive imaginary part), and it is shown that 
both G n (X) and G((A n (A)) fulfill the same equation, which is, after our 
rescaling, of the form (12. ip . Since both G(A„(A)) and G n (X) satisfy (as 
Cauchy transforms at some value of the argument) the right positivity 
condition, they both must agree with the unique solution, given by our 
theorem, and thus G(A n (X)) = G n (X). 

2.2. Related Topics. Suppose rj maps selfadjoint elements to self- 
adjoint elements; then it maps positive elements SA+ of A to SA+ . 
Suppose further that V > 0. Then, if r] is linear, the map Ty restricted 
to SA + is a monotone decreasing map, that is, if W\ > W 2 > 0, then 
FyiWi) < TyiyV-z). There are results in [8] and [I] yielding fixed points 
in this case with proofs quite different than here. A list of applications 
and existing results on special cases is in these papers. 

3. Contraction Maps and Proofs 

We will prove our theorem by applying Banach's fixed point theorem 
to the map Ty. One should note that Ty is usually not a contraction 
in the given operator norm on A + . Analyticity of our mapping, how- 
ever, provides us with another metric in which we have the contraction 
property. 

On A+, since it is a domain in a Banach space, is the well known 
Caratheodory metric, one of the biholomorphically invariant metrics on 
A+. See for [2] for an excellent exposition of this and material germain 
to our treatment here. The crucial point is that strict holomorphic 
mappings on such domains are automatically strict contractions in this 
metric, and thus Banach's fixed point theorem guarantees a unique 
fixed point of such mappings. For the reader's convenience we recall 
here the relevant theorem, due to Earle and Hamilton [TJ. 

Theorem 3.1. (Earle-Hamilton) Let D be a nonempty domain in a 
complex Banach space X and let h : T> — > T> be a bounded holomorphic 
function. If h(V) lies strictly inside V (i.e., there is some e > such 
that B e (h(x)) C V, whenever x G V, where B e (y) is the ball of radius 
e about y ), then h is a strict contraction in the Caratheodory-Riffen- 
Finsler metric p, and thus has a unique fixed point in T>. Furthermore, 
one has for all x,y G T> that p{x,y) > m\\x — y\\ for some constant 
m > 0, and thus (/i n (xo))neN converges in norm, for any x G V, to 
this fixed point. 
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We will now apply this to our situation. The main point will be to 
check that Ty is well-defined and maps suitably chosen subsets Rb C 
A + strictly into itself. (Note that we do not claim that Ty is a strict 
contraction on A + itself.) 
For b > 0, let us define 



Rb := {WeA+\ \\W\\ < b} c A+ 

Proposition 3.2. (1) Any fixed point W of Ty satisfies Eq. ( 12. II) . 

(2) If Rer](W) > and V G A+, then V + rj(W) is mvertible 
in A, thus Ty(W) G A is well-defined, and ReTy(W) > 0. 
Furthermore, we have 

\\t v (W)\\ < IKRey)- 1 !!. 

(3) For V G A + and b > ||(Re V^) — 1 1 1 > ; the map Ty is a bounded 
holomorphic map on Rb and takes Rb strictly into its interior. 
We have for W G R b that 



ReT v (W) > 



1 



m 2 b - IKReV)- 1 !! 
with 

m b := \\V\\ + sup{||t7(W0|| | W G A + , \\W\\ < b}. 

(4) Ty maps, for any V G A + , the strict upper half plane of A into 
itself, 

Ty : A+^> A+. 
It is a bounded holomorphic map there. 

Proof. (1) This is obvious. 

(2) Let A and B be the real and imaginary part of rj(W), respectively, 
i.e., r)(W) = A + iB with A = A* and B = B* . By our assumption, 
A > 0. Put K := V + r)(W). Then we have: 

ReK = ReV + A > ReV. 

Since V G A+, the real part of K is bounded away from zero by a 
multiple of identity, and thus K is invertible as a bounded operator. 
Furthermore, the norm of the inverse K~ x = Ty(W) can be bounded 
by || (Re V) -1 1| . (For more details about these statements, see Lemma 
3.1 in [3].) 



OPERATOR- VALUED SEMICIRCULAR ELEMENTS 



7 



To see the positivity of the real part of the inverse, we calculate: 
2 • ReF v {W) = [V + A + iB}~ 1 + [V + A + iB}*' 1 

= [V + A + iB]-\2A + V + V*)[V* + A - iB}' 1 
= 2F v (W)(A + ReV)Fv(W)* 

> 2F V (W) -ReV-F v (W)* 

> 0. 

(3) The map W i— > K(W) := V + rj(W) is analytic by the analyticity 
of T], and Ty^W) = K{W)~ l exists - and is thus also analytic — for W G 
A + , by (2) (note that the fact that rj preserves the positivity of the real 
part implies that the assumptions of (2) are satisfied). Furthermore, 
by the norm estimate from (2), we have for all W £ Rb that 

(3.i) ll-Mwou < IKRe^)- 1 !! <b. 

Thus Ty is a bounded holomorphic map on Rf,, with image in Rf,. To 
see that the image lies strictly in Rb, we have to see that Fv(W) stays 
away from the boundary of Rb by some e-amount. By (13.11) we stay 
away from the boundary \\W\\ = b by at least (b — ||(Rey) _1 ||). It 
remains to consider the part of the boundary described by ReW = 0. 
By refining the last inequality of the calculation from (2) in our present 
setting we have 

Re T V {W) > F V (W) ■ ReV • F V (W)* 

= [K(W)* ■ (ReV)' 1 -KiW)}- 1 
1 



> 



> 



\\K(W)* ■ (ReV)- 1 -K(W)\\ 
1 



m 2 b ■ || (Re V)- 1 1| 

since ||X(M' r )|| < nib for ah W £ Rb- Thus we stay away from the 
boundary ReW = by at least • || (Re V)' 1 1|). 

(4) This follows from the fact that 

A+ = \jR b ; 

b>0 

note that the estimate ((^/(H 7 )!) < || (Re V) — 1 1| does not depend on 
b. □ 

Proof of Theorem \2.1[ By the Earle-Hamilton Theorem, each Rb with 
b > || (Re V)~ x || contains exactly one fixed point of (12.11) . The estimates 
for the norm and the real part of W follow from the corresponding 
estimates in parts (2) and (3) of Prop. 13. 2[ □ 
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Remark 3.3. 1) Since the application of Stieltjes inversion formula asks 
for z very close to the real axis, one might be tempted to try to solve 
(11.21) directly for real z. Of course, most of the above statements then 
break down. In particular, one should consider the map Ty for V 
with ReV^ > on the domain ReW > instead of A+. In the infi- 
nite dimensional case, those two notions are not the same, and using 
Re W > presents problems. [Ty is not even well-defined there in gen- 
eral, for ReV > 0.) In the finite dimensional case, however, KeW > 
just says that all eigenvalues of Re W are positive and different from 
zero, thus Re W is positive and invertible, thus W G A + and one can 
extend some of the above reasoning to the set {W G A \ ReW > 0}. 
This domain is mapped, under Ty into itself, but now of course not 
necessarily strictly. This implies (see Proposition 6 in [5] ) that, we still 
get a contraction in the Caratheodory metric, but the contraction con- 
stant is not necessarily less than 1. We now state this with formulas. 
Let A be finite dimensional (i.e., some matrix algebra). Let p denote 
the Caratheodory metric on the set {W G A \ ReW > 0}. Then for 
Re V > 0, there is cy < 1 such that 

p(T v (W),T v (W))<c v p(W,W) 

if ReW > and ReW > 0. 

2) One can generalize our considerations from (ll.2p to the equation 

(3.2) G(z)=G 1 [z-r ] (G(z))}. 

The latter describes [Til [12] the operator- valued Cauchy transform of 
the sum x+s where s G B is an operator- valued semicircular element as 
before (with covariance function if) and x = x* G B is an element which 
is free from s with respect to the conditional expectation E : B — > A; 

Gi{z) :=E[—) 
z — x 

is the given operator- valued Cauchy transform of x. Note that for x = 
we have G\(z) = 1/z and (13.21) reduces to the fixed point version of 
(11.21) . In the scalar- valued case A = C, equation (13.21) was derived by 
Pastur [7], describing a 'deformed semicircle'. 

The same arguments as before show that for any fixed z with positive 
imaginary part there exists exactly one solution of (13.21) whose imagi- 
nary part is strictly negative. This unique solution can be obtained as 
the limit of iterates of the mapping G \— > G\[z — tj(G)], for any initial 
G with strictly negative imaginary part. 
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4. Numerical considerations 

4.1. Iteration and averaging. Let us come back to our motivating 
equation (jl.2p for the operator-valued Cauchy transform G(z) of an 
operator- valued semicircular element. According to Theorem 12.11 we 
can get the wanted solution G = —iW of Eq. (11.21) by iterating the 
mapping W i— > J-'ziW) := [— izI+r/iW)]' 1 , starting with any W G A + . 
Even though this seems to completely solve our problem, it turns out 
that numerically this might not work too well. Namely, since we want to 
invoke the Stieltjes inversion formula to recover the wanted eigenvalue 
distribution from G(z) we need z very close to the real axis. Then 
T z is still a contraction (in the Caratheodory metric), the contraction 
constant, however, might be very close to 1, and the convergence could 
be extremely slow. In typical numerical examples, there was, for z very 
close to the real axis, no numerically observable convergence to a fixed 
point at all. (In some cases it looked as if T z would have a limiting 
2-cycle.) To improve on this, we invoked also an averaging procedure 
along with our iteration. In the simplest case, we replaced the iteration 
algorithm 

W i-> F Z (W) 

by 

W»g g (W) : =±W + ±F g (W). 

Note that Q z has essentially the same properties as T z \ in particular, 
the fixed points are the same, and Q z is a bounded holomorphic map on 
Rb and takes Rb strictly into its interior, as follows from the following 
lemma. 

Proposition 4.1. Let T> be a convex and bounded domain in a Banach 
space X and T :T> —>T> a bounded holomorphic function. Assume that 
T(T>) lies strictly inside T>. Put 

Q{w)--=\w + \nw). 

Then Q : D —>■ T> is bounded and holomorphic and maps D strictly into 
itself. Thus Q iterations applied to any W° inside T> converge to a fixed 
point for both Q and T . 

Proof. Only the fact that Q(T>) lies strictly inside T> is not directly 
clear. Let e > be such that B e {F(W)) C V for all W e V. We claim 
that B £ / 2 (Q(W)) C V for all W G V. To see this assume there is a 
W G V and an r G X with ||r|| < e/2 such that Q(W) + r <£ V. But 
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we know that J-'iW) + 2r e V and thus, since V is convex, also 
Q(W) + r = ~{F(W) + 2r) + -W £ P. 

The convergence of Q iterates, follows immediately from the Earle- 
Hamilton Theorem. □ 

As stated iterations of Q z must converge to the same fixed point as it- 
erations of T z . Numerically, in all our considered reasonable examples, 
the speed of convergence for the averaged iteration was substantially 
faster than for the plain iterations and allowed a fast numerical deter- 
mination of the desired fixed point. 

Let us, however, point out that there is no theoretical reason that 
the averaged iteration has a better contraction constant than the plain 
iteration (actually it might even be worse), and that one can produce 
artificial examples where the averaging does not improve the plain it- 
eration. For example, in the case of 2 x 2-matrices where 

rj(W) = AW A, with A = 

and for initial condition 

Wo= (o 1/2)' 

there is no clear improvement of the averaged iteration over the plain 
iteration for z close to 0. 

Of course, the usual algorithm of choice for solving equations like 
(11.21) or (12.1 p would be a kind of Newton's method applied at later 
iterations. This has much faster convergence properties (second order 
local convergence compared to first order convergence rate of our it- 
eration). However, Newton's method does in general not respect the 
positivity requirement for our solution. So even if we start in A+, it 
might happen that the Newton's method leads out of A+ and converges 
to another root which is not in A+. One might expect that iterating (or 
iterating and averaging) for a while will get in the basin of attraction 
of the correct point and then switching to the Newton's method should 
be fairly safe and fast. 

Whereas we do not have any theoretical results to justify averaging 
or a modification of the Newton's method in all cases, we want to point 
out that the uniqueness part in our Theorem allows us to check in a 
concrete example whether our algorithm of choice works or not. We 
only have to check whether the solution produced by our algorithm is 
in A + . 
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4.2. Numerical Examples. We consider the block Toeplitz random 
matrix 



(4.1) 



X 



ABC 
BAB 
C B A 



where A, B, C are independent selfadjoint N x N matrices with i.i.d. 
entries of unit variance. In [9] it was shown that, in the limit N — > oo, 
the Cauchy-transform H of the eigenvalue distribution of X is given 
by H(z) = tTs(G(z)) where tr3 denotes the normalized trace on 3 x 3- 
matrices and where the operator-valued Cauchy transform G(z) is a 
3 x 3-matrix of the form 



(4.2) 



C7 



/ h 
g 
h / 



G satisfies Eq. (jl.2p . where r\ acts as follows: 



(4.3) 



V(G) = - 



2f + g 


g + 2h 





2f+g+2h 




g + 2h 


2f + g 



In order to get the eigenvalue distribution of X we have to solve 
Eq. (11.21) for z running along the real axis. We solve (ll.2p either by 
our modified iteration procedure or by Newton's algorithm. In both 
cases we start from the center where zq is an imaginary number close 
to the origin (zq = 10 -9 • i for the following results) and we use the 
solution at each z n = zq + d ■ A ■ n as the initial point for the next 
z n+i = z + d ■ A ■ (n + 1) where A 6 1 is the resolution step size in 
calculating the spectrum and d = =Fl depending on negative or positive 
side of the spectrum. In other words, we calculate the spectrum in two 
phases both starting from the center, one for the positive side of the 
spectrum (d — 1), then for the negative side (d = —1). 



4.2.1. Solution by iteration. With the convention W(z) = iG(z) 
use 

" 1 -0.1 ■% 1 
W (z)= 1-0.1-z 

1 1-0.1-z 



we 



as the initializing matrix at center (z 



10" 



for the iteration 



method. Figure [T] depicts the calculated spectrum which matches com- 
pletely with the true spectrum [9]. 
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Figure 1. Iteration method calculation result for the 
asymptotic eigenvalue distribution of the 3x3 block 
Toeplitz matrix as in (14. ip . 



4.2.2. Failure of Newton's method. Replacing ( l4~2l) and f fl~3l) in ( TO) , 
one reaches to the following system of equations: 

1 , g (f + h) + 2(p + h*) 



(4.4) 
(4.5) 
(4.6) 



zf 



zg 



zh 



1 + 



g ( g + 2(f + h)) 



Afh + g (f + h) 



We used Newton's method to solve this system numerically, starting 
with the initial values 



" fo ' 




' 1-0.1 -i ' 


9o 




1-0.1-2 


. h ° . 




1 



for z = 10~~ 9 i on the imaginary axis and then using the fixed point 
for one z n as the initial point for the next z n+ \. The result is shown 
in Figure [2j Clearly this is not the desired result and the algorithm 
has converged to some different roots, which in this case do not yield 
a distribution function (though it is positive everywhere). In other 
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0.14p 
0.12- 
0.1- 
0.08- 
0.06- 
0.04- 
0.02- 



.5 -2 -1.5 -1 -0.5 0.5 1 1.5 2 2.5 



Figure 2. Newton's method calculation result for the 
eigenvalue distribution of the 3x3 block Toeplitz matrix 
as in fTCTj) . 

words, using Newton's method, one has to check the positivity of the 
final result to make sure that it has converged to the desired solution. 
Of course, in our example the 3x3 matrix G produced by Newton's 
algorithm is not positive. 
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